Method and apparatus for synchronizing applications for data recovery using storage based journaling

ABSTRACT

Disclosed is a method to synchronize the state of an application and an application's objects with data stored on the storage system. The storage system provides API's to create special data, called a marker journal, and stores it on a journal volume. The marker contains application information, e.g. file name, operation on the file, timestamp, etc. Since the journal volume contains markers as well as any changed data in the chronological order, IO activities to the storage system and application activities can be synchronized.

CROSS-REFERENCES TO RELATED APPLICATIONS

This application is related to the following commonly owned and co-pending U.S. applications:

-   “Method and Apparatus for Data Recovery Using Storage Based Journaling,” Attorney Docket Number 16869B-082700US, and
-   “Method and Apparatus for Data Recovery Using Storage Based Journaling,” Attorney Docket Number 16869B-082800US,

both of which are herein incorporated by reference for all purposes.

BACKGROUND OF THE INVENTION

The present invention is related to computer storage and in particular to the recovery of data.

Several methods are conventionally used to prevent the loss of data. Typically, data is backed up in a periodic manner (e.g., once a day) by a system administrator. Many systems are commercially available which provide backup and recovery of data; e.g., Veritas NetBackup, Legato/Networker, and so on. Another technique is known as volume shadowing. This technique produces a mirror image of data onto a secondary storage system as it is being written to the primary storage system.

Journaling is a backup and restore technique commonly used in database systems. An image of the data to be backed up is taken. Then, as changes are made to the data, a journal of the changes is maintained. Recovery of data is accomplished by applying the journal to an appropriate image to recover data at any point in time. Typical database systems, such as Oracle, can perform journaling.

Except for database systems, however, there are no ways to recover data at any point in time. Even for database systems, applying a journal takes time since the procedure includes:

-   reading the journal data from storage (e.g., disk)
-   analyzing the journal to determine where in the journal the desired data can be found
-   applying the journal data to a suitable image of the data to reproduce the activities performed on the data; this usually involves accessing the image, and writing out data as the journal is applied

Recovering data at any point in time addresses the following types of administrative requirements. For example, a typical request might be, “I deleted a file by mistake at around 10:00 am yesterday. I have to recover the file just before it was deleted.”

If the data is not in a database system, this kind of request cannot be conveniently, if at all, serviced. A need therefore exists for processing data in a manner that facilitates recovery of lost data. A need exists for being able to provide data processing that facilitates data recovery in user environments other than in a database application.

SUMMARY OF THE INVENTION

In accordance with an aspect of the present invention, a storage system exposes an application programmer's interface (API) for application programs running on a host. The API allows execution of program code to create marker journal entries. The API also provides for retrieval of marker journals, and recovery operations. Another aspect of the invention is the monitoring of operations being performed on a data store and the creation of marker journal entries upon detection of one or more predetermined operations. Still another aspect of the invention is the retrieval of marker journal entries to facilitate recovery of a desired data state.

BRIEF DESCRIPTION OF THE DRAWINGS

Aspects, advantages and novel features of the present invention will become apparent from the following description of the invention presented in conjunction with the accompanying drawings:

FIG. 1 is a high level generalized block diagram of an illustrative embodiment of the present invention;

FIG. 2 is a generalized illustration of an illustrative embodiment of a data structure for storing journal entries in accordance with the present invention;

FIG. 3 is a generalized illustration of an illustrative embodiment of a data structure for managing the snapshot volumes and the journal entry volumes in accordance with the present invention;

FIG. 4 is a high level flow diagram highlighting the processing between the recovery manager and the controller in the storage system;

FIG. 5 illustrates the relationship between a snapshot and a plurality of journal entries;

FIG. 5A illustrates the relationship among a plurality of snapshots and a plurality of journal entries;

FIG. 6 is a high level illustration of the data flow when an overflow condition arises;

FIG. 7 is a high level flow chart highlighting an aspect of the controller in the storage system to handle an overflow condition;

FIG. 7A illustrates an alternative to a processing step shown in FIG. 7;

FIG. 8 illustrates the use of marker journal entries;

FIG. 9 shows a SCSI-based implementation of the embodiment shown in FIG. 8;

FIG. 10 shows a block diagram of the API's according to another aspect of the invention; and

FIG. 11 is a flowchart highlighting the steps for a recovery operation.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS

FIG. 1 is a high level generalized block diagram of an illustrative embodiment of a backup and recovery system according to the present invention. When the system is activated, a snapshot is taken for production data volumes (DVOL) 101. The term “snapshot” in this context conventionally refers to a data image of the data volume at a given point in time. Depending on system requirements, implementation, and so on, the snapshot can be of the entire data volume, or some portion or portions of the data volume(s). During the normal course of operation of the system in accordance with the invention, a journal entry is made for every write operation issued from the host to the data volumes. As will be discussed below, by applying a series of journal entries to an appropriate snapshot, data can be recovered at any point in time.

The backup and recovery system shown in FIG. 1 includes at least one storage system 100. Though not shown, one of ordinary skill can appreciate that the storage system includes suitable processor(s), memory, and control circuitry to perform IO between a host 110 and its storage media (e.g., disks). The backup and recovery system also requires at least one host 110. A suitable communication path 130 is provided between the host and the storage system.

The host 110 typically will have one or more user applications (APP) 112 executing on it. These applications will read and/or write data to storage media contained in the data volumes 101 of storage system 100. Thus, applications 112 and the data volumes 101 represent the target resources to be protected. It can be appreciated that data used by the user applications can be stored in one or more data volumes.

In accordance with the invention, a journal group (JNLG) 102 is defined. The data volumes 101 are organized into the journal group. In accordance with the present invention, a journal group is the smallest unit of data volumes where journaling of the write operations from the host 110 to the data volumes is guaranteed. The associated journal records the order of write operations from the host to the data volumes in proper sequence. The journal data produced by the journaling activity can be stored in one or more journal volumes (JVOL) 106.

The host 110 also includes a recovery manager (RM) 111. This component provides a high level coordination of the backup and recovery operations. The recovery manager is discussed further below.

The storage system 100 provides a snapshot (SS) 105 of the data volumes comprising a journal group. For example, the snapshot 105 is representative of the data volumes 101 in the journal group 102 at the point in time that the snapshot was taken. Conventional methods are known for producing the snapshot image. One or more snapshot volumes (SVOL) 107 are provided in the storage system which contain the snapshot data. A snapshot can be contained in one or more snapshot volumes. Though the disclosed embodiment illustrates separate storage components for the journal data and the snapshot data, it can be appreciated that other implementations can provide a single storage component for storing the journal data and the snapshot data.

A management table (MT) 108 is provided to store the information relating to the journal group 102, the snapshot 105, and the journal volume(s) 106. FIG. 3 and the accompanying discussion below reveal additional detail about the management table.

A controller component 140 is also provided which coordinates the journaling of write operations and snapshots of the data volumes, and the corresponding movement of data among the different storage components 101, 106, 107. It can be appreciated that the controller component is a logical representation of a physical implementation which may comprise one or more sub-components distributed within the storage system 100.

FIG. 2 shows the data used in an implementation of the journal. When a write request from the host 110 arrives at the storage system 100, a journal is generated in response. The journal comprises a Journal Header 219 and Journal Data 225. The Journal Header 219 contains information about its corresponding Journal Data 225. The Journal Data 225 comprises the data (write data) that is the subject of the write operation.

The Journal Header 219 comprises an offset number (JH_OFS) 211. The offset number identifies a particular data volume 101 in the journal group 102. In this particular implementation, the data volumes are ordered as the 0th data volume, the 1st data volume, the 2nd data volume and so on. The offset numbers might be 0, 1, 2, etc.

The starting address in the data volume (identified by the offset number 211) to which the write data is to be written is stored in an address field (JH_ADR) 212 of the Journal Header 219. For example, the address can be represented as a block number (LBA, Logical Block Address).

A field in the Journal Header 219 stores a data length (JH_LEN) 213, which represents the data length of the write data. Typically it is represented as a number of blocks.

A field in the Journal Header 219 stores the write time (JH_TIME) 214, which represents the time when the write request arrives at the storage system 100. The write time can include the calendar date, hours, minutes, seconds and even milliseconds. This time can be provided by the disk controller 140 or by the host 110. For example, in a mainframe computing environment, two or more mainframe hosts share a timer and can provide the time when a write command is issued.

A sequence number (JH_SEQ) 215 is assigned to each write request. The sequence number is stored in a field in the Journal Header 219. Every sequence number within a given journal group 102 is unique. The sequence number is assigned to a journal entry when it is created.

A journal volume identifier (JH_JVOL) 216 is also stored in the Journal Header 219. The volume identifier identifies the journal volume 106 associated with the Journal Data 225. The identifier is indicative of the journal volume containing the Journal Data. It is noted that the Journal Data can be stored in a journal volume that is different from the journal volume which contains the Journal Header.

A journal data address (JH_JADR) 217 stored in the Journal Header 219 contains the beginning address of the Journal Data 225 in the associated journal volume 106 that contains the Journal Data.

FIG. 2 shows that the journal volume 106 comprises two data areas: a Journal Header Area 210 and a Journal Data Area 220. The Journal Header Area 210 contains only Journal Headers 219, and Journal Data Area 220 contains only Journal Data 225. The Journal Header is a fixed size data structure. A Journal Header is allocated sequentially from the beginning of the Journal Header Area. This sequential organization corresponds to the chronological order of the journal entries. As will be discussed, data is provided that points to the first journal entry in the list, which represents the “oldest” journal entry. It is typically necessary to find the Journal Header 219 for a given sequence number (as stored in the sequence number field 215) or for a given write time (as stored in the time field 214).

A journal type field (JH_TYPE) 218 identifies the type of journal entry. The value contained in this field indicates a type of MARKER or INTERNAL. If the type is MARKER, then the journal is a marker journal. The purpose of a MARKER type of journal will be discussed below. If the type is INTERNAL, then the journal records the data that is the subject of the write operation issued from the host 110.
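The Journal Header fields described above can be summarized in a short sketch. This is only an illustration in Python, not the layout used by the storage system: the class names, the Python types, and the representation of JH_TIME as seconds are assumptions, and only the field names and reference numerals come from the description.

```python
from dataclasses import dataclass
from enum import Enum


class JournalType(Enum):
    MARKER = "MARKER"      # marker journal entry
    INTERNAL = "INTERNAL"  # entry recording data written by the host


@dataclass
class JournalHeader:
    jh_ofs: int            # JH_OFS 211: offset of the data volume in the journal group
    jh_adr: int            # JH_ADR 212: starting block address (LBA) in the data volume
    jh_len: int            # JH_LEN 213: length of the write data, in blocks
    jh_time: float         # JH_TIME 214: write time (seconds is an assumed unit)
    jh_seq: int            # JH_SEQ 215: sequence number, unique within the journal group
    jh_jvol: int           # JH_JVOL 216: journal volume holding the Journal Data
    jh_jadr: int           # JH_JADR 217: address of the Journal Data in that volume
    jh_type: JournalType   # JH_TYPE 218: MARKER or INTERNAL


# Hypothetical header for an 8-block write to data volume 0.
header = JournalHeader(jh_ofs=0, jh_adr=2048, jh_len=8, jh_time=1000.0,
                       jh_seq=3, jh_jvol=0, jh_jadr=4096,
                       jh_type=JournalType.INTERNAL)
```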

Journal Header 219 and Journal Data 225 are contained in chronological order in their respective areas in the journal volume 106. Thus, the order in which the Journal Header and the Journal Data are stored in the journal volume is the same order as the assigned sequence number. As will be discussed below, an aspect of the present invention is that the journal information 219, 225 wrap within their respective areas 210, 220.

FIG. 3 shows detail about the management table 108 (FIG. 1). In order to manage the Journal Header Area 210 and Journal Data Area 220, pointers for each area are needed. As mentioned above, the management table maintains configuration information about a journal group 102 and the relationship between the journal group and its associated journal volume(s) 106 and snapshot image 105.

The management table 300 shown in FIG. 3 illustrates an example management table and its contents. The management table stores a journal group ID (GRID) 310 which identifies a particular journal group 102 in a storage system 100. A journal group name (GRNAME) 311 can also be provided to identify the journal group with a human recognizable identifier.

A journal attribute (GRATTR) 312 is associated with the journal group 102. In accordance with this particular implementation, two attributes are defined: MASTER and RESTORE. The MASTER attribute indicates the journal group is being journaled. The RESTORE attribute indicates that the journal group is being restored from a journal.

A journal status (GRSTS) 315 is associated with the journal group 102. There are two statuses: ACTIVE and INACTIVE.

The management table includes a field to hold a sequence counter (SEQ) 313. This counter serves as the source of sequence numbers used in the Journal Header 219. When creating a new journal, the sequence number 313 is read and assigned to the new journal. Then, the sequence number is incremented and written back into the management table.
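The read, assign, increment and write-back cycle of the sequence counter can be sketched as follows. This is a minimal illustration in Python; the class and method names are hypothetical, and the starting value of zero is an assumption.

```python
class ManagementTable:
    """Holds only the sequence counter (SEQ) 313 for this sketch."""

    def __init__(self):
        self.seq = 0  # assumed initial value of SEQ 313

    def next_sequence(self):
        # Read the counter, hand the value to the new journal entry,
        # then increment and store the counter back in the table.
        assigned = self.seq
        self.seq = assigned + 1
        return assigned


mt = ManagementTable()
first = mt.next_sequence()
second = mt.next_sequence()
```

Because every number is handed out exactly once, each sequence number is unique within the journal group.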

The number (NUM_DVOL) 314 of data volumes 101 contained in a given journal group 102 is stored in the management table.

A data volume list (DVOL_LIST) 320 lists the data volumes in a journal group. In a particular implementation, DVOL_LIST is a pointer to the first entry of a data structure which holds the data volume information. This can be seen in FIG. 3. Each data volume information entry comprises an offset number (DVOL_OFFS) 321. For example, if the journal group 102 comprises three data volumes, the offset values could be 0, 1 and 2. A data volume identifier (DVOL_ID) 322 uniquely identifies a data volume within the entire storage system 100. A pointer (DVOL_NEXT) 324 points to the data structure holding information for the next data volume in the journal group; it is a NULL value otherwise.

The management table includes a field to store the number of journal volumes (NUM_JVOL) 330 that are being used to contain the data (journal header and journal data) associated with a journal group 102.

As described in FIG. 2, the Journal Header Area 210 contains the Journal Headers 219 for each journal; likewise for the Journal Data components 225. As mentioned above, an aspect of the invention is that the data areas 210, 220 wrap. This allows for journaling to continue despite the fact that there is limited space in each data area.

The management table includes fields to store pointers to different parts of the data areas 210, 220 to facilitate wrapping. Fields are provided to identify where the next journal entry is to be stored. A field (JI_HEAD_VOL) 331 identifies the journal volume 106 that contains the Journal Header Area 210 which will store the next new Journal Header 219. A field (JI_HEAD_ADR) 332 identifies an address on the journal volume of the location in the Journal Header Area where the next Journal Header will be stored. The journal volume that contains the Journal Data Area 220 into which the journal data will be stored is identified by information in a field (JI_DATA_VOL) 335. A field (JI_DATA_ADR) 336 identifies the specific address in the Journal Data Area where the data will be stored. Thus, the next journal entry to be written is “pointed” to by the information contained in the “JI_” fields 331, 332, 335, 336.

The management table also includes fields which identify the “oldest” journal entry. The use of this information will be described below. A field (JO_HEAD_VOL) 333 identifies the journal volume which stores the Journal Header Area 210 that contains the oldest Journal Header 219. A field (JO_HEAD_ADR) 334 identifies the address within the Journal Header Area of the location of the journal header of the oldest journal. A field (JO_DATA_VOL) 337 identifies the journal volume which stores the Journal Data Area 220 that contains the data of the oldest journal. The location of the data in the Journal Data Area is stored in a field (JO_DATA_ADR) 338.

The management table includes a list of journal volumes (JVOL_LIST) 340 associated with a particular journal group 102. In a particular implementation, JVOL_LIST is a pointer to a data structure of information for journal volumes. As can be seen in FIG. 3, each data structure comprises an offset number (JVOL_OFS) 341 which identifies a particular journal volume 106 associated with a given journal group 102. For example, if a journal group is associated with two journal volumes 106, then each journal volume might be identified by a 0 or a 1. A journal volume identifier (JVOL_ID) 342 uniquely identifies the journal volume within the storage system 100. Finally, a pointer (JVOL_NEXT) 344 points to the next data structure entry pertaining to the next journal volume associated with the journal group; it is a NULL value otherwise.

The management table includes a list (SS_LIST) 350 of snapshot images 105 associated with a given journal group 102. In this particular implementation, SS_LIST is a pointer to snapshot information data structures, as indicated in FIG. 3. Each snapshot information data structure includes a sequence number (SS_SEQ) 351 that is assigned when the snapshot is taken. As discussed above, the number comes from the sequence counter 313. A time value (SS_TIME) 352 indicates the time when the snapshot was taken. A status (SS_STS) 358 is associated with each snapshot; valid values include VALID and INVALID. A pointer (SS_NEXT) 353 points to the next snapshot information data structure; it is a NULL value otherwise.

Each snapshot information data structure also includes a list of snapshot volumes 107 (FIG. 1) used to store the snapshot images 105. As can be seen in FIG. 3, a pointer (SVOL_LIST) 354 to a snapshot volume information data structure is stored in each snapshot information data structure. Each snapshot volume information data structure includes an offset number (SVOL_OFFS) 355 which identifies a snapshot volume that contains at least a portion of the snapshot image. It is possible that a snapshot image will be segmented or otherwise partitioned and stored in more than one snapshot volume. In this particular implementation, the offset identifies the i-th snapshot volume which contains a portion (segment, partition, etc.) of the snapshot image. In one implementation, the i-th segment of the snapshot image might be stored in the i-th snapshot volume. Each snapshot volume information data structure further includes a snapshot volume identifier (SVOL_ID) 356 that uniquely identifies the snapshot volume in the storage system 100. A pointer (SVOL_NEXT) 357 points to the next snapshot volume information data structure for a given snapshot image.

FIG. 4 shows a flowchart highlighting the processing performed by the recovery manager 111 and storage system 100 to initiate backup processing in accordance with the illustrative embodiment of the invention as shown in the figures. If journal entries were not recorded during the taking of a snapshot, the write operations corresponding to those journal entries would be lost and data corruption could occur during a data restoration operation. Thus, in accordance with an aspect of the invention, the journaling process is started prior to taking the first snapshot. Doing this ensures that any write operations which occur during the taking of a snapshot are journaled. As a note, any journal entries recorded prior to the completion of the snapshot can be ignored.

Further in accordance with the invention, a single sequence of numbers (SEQ) 313 is associated with each of one or more snapshots and journal entries, as they are created. The purpose of associating the same sequence of numbers to both the snapshots and the journal entries will be discussed below.

Continuing with FIG. 4, the recovery manager 111 might define, in a step 410, a journal group (JNLG) 102 if one has not already been defined. As indicated in FIG. 1, this may include identifying one or more data volumes (DVOL) 101 for which journaling is performed, and identifying one or more journal volumes (JVOL) 106 which are used to store the journal-related information. The recovery manager performs a suitable sequence of interactions with the storage system 100 to accomplish this. In a step 415, the storage system may create a management table 108 (FIG. 1), incorporating the various information shown in the table detail 300 illustrated in FIG. 3. Among other things, the process includes initializing the JVOL_LIST 340 to list the journal volumes which comprise the journal group 102. Likewise, the list of data volumes DVOL_LIST 320 is created. The fields which identify the next journal entry (or in this case where the table is first created, the first journal entry) are initialized. Thus, JI_HEAD_VOL 331 might identify the first in the list of journal volumes and JI_HEAD_ADR 332 might point to the first entry in the Journal Header Area 210 located in the first journal volume. Likewise, JI_DATA_VOL 335 might identify the first in the list of journal volumes and JI_DATA_ADR 336 might point to the beginning of the Journal Data Area 220 in the first journal volume. Note that the header and the data areas 210, 220 may reside on different journal volumes, so JI_DATA_VOL might identify a journal volume different from the first journal volume.

In a step 420, the recovery manager 111 will initiate the journaling process. Suitable communication(s) are made to the storage system 100 to perform journaling. In a step 425, the storage system will make a journal entry for each write operation that issues from the host 110.

With reference to FIG. 3, making a journal entry includes, among other things, identifying the location for the next journal entry. The fields JI_HEAD_VOL 331 and JI_HEAD_ADR 332 identify the journal volume 106 and the location in the Journal Header Area 210 of the next Journal Header 219. The sequence counter (SEQ) 313 from the management table is copied to (associated with) the JH_SEQ 215 field of the next header. The sequence counter is then incremented and stored back to the management table. Of course, the sequence counter can be incremented first, copied to JH_SEQ, and then stored back to the management table.

The fields JI_DATA_VOL 335 and JI_DATA_ADR 336 in the management table identify the journal volume and the beginning of the Journal Data Area 220 for storing the data associated with the write operation. The JI_DATA_VOL and JI_DATA_ADR fields are copied to JH_JVOL 216 and to JH_JADR 217, respectively, of the Journal Header, thus providing the Journal Header with a pointer to its corresponding Journal Data. The data of the write operation is stored.

The JI_HEAD_VOL 331 and JI_HEAD_ADR 332 fields are updated to point to the next Journal Header 219 for the next journal entry. This involves taking the next contiguous Journal Header entry in the Journal Header Area 210. Likewise, the JI_DATA_ADR field (and perhaps JI_DATA_VOL field) is updated to reflect the beginning of the Journal Data Area for the next journal entry. This involves advancing to the next available location in the Journal Data Area. These fields therefore can be viewed as pointing to a list of journal entries. Journal entries in the list are linked together by virtue of the sequential organization of the Journal Headers 219 in the Journal Header Area 210.

When the end of the Journal Header Area 210 is reached, the Journal Header 219 for the next journal entry wraps to the beginning of the Journal Header Area. The Journal Data 225 wraps similarly within the Journal Data Area 220. To prevent overwriting earlier journal entries, the present invention provides for a procedure to free up entries in the journal volume 106. This aspect of the invention is discussed below.
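The wrapping of the next-header pointer can be illustrated with a small sketch. It assumes a single journal volume and fixed-size header slots; the constants and the function name are hypothetical and chosen only to keep the example small. It does not model the protection against overwriting entries that have not yet been freed.

```python
HEADER_SIZE = 64        # assumed size of one fixed-size Journal Header, in bytes
HEADER_AREA_START = 0   # assumed start address of the Journal Header Area
HEADER_AREA_SLOTS = 4   # assumed capacity, kept tiny for illustration


def advance_header_address(ji_head_adr):
    """Return the address of the next header slot, wrapping at the end of the area."""
    slot = (ji_head_adr - HEADER_AREA_START) // HEADER_SIZE
    next_slot = (slot + 1) % HEADER_AREA_SLOTS
    return HEADER_AREA_START + next_slot * HEADER_SIZE


# Walk the pointer through five allocations; the fifth lands back at the start.
adr = HEADER_AREA_START
visited = []
for _ in range(5):
    visited.append(adr)
    adr = advance_header_address(adr)
```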

For the very first journal entry, the JO_HEAD_VOL field 333, JO_HEAD_ADR field 334, JO_DATA_VOL field 337, and the JO_DATA_ADR field 338 are set to contain the contents of their corresponding “JI_” fields. As will be explained, the “JO_” fields point to the oldest journal entry. Thus, as new journal entries are made, the “JO_” fields do not advance while the “JI_” fields do advance. Update of the “JO_” fields is discussed below.

Continuing with the flowchart of FIG. 4, when the journaling process has been initiated, all write operations issuing from the host are journaled. Then in a step 430, the recovery manager 111 will initiate taking a snapshot of the data volumes 101. The storage system 100 receives an indication from the recovery manager to take a snapshot. In a step 435, the storage system performs the process of taking a snapshot of the data volumes. Among other things, this includes accessing SS_LIST 350 from the management table (FIG. 3). A suitable amount of memory is allocated for fields 351-354 to represent the next snapshot. The sequence counter (SEQ) 313 is copied to the field SS_SEQ 351 and incremented, in the manner discussed above for JH_SEQ 215. Thus, over time, a sequence of numbers is produced from SEQ 313, each number in the sequence being assigned either to a journal entry or a snapshot entry.

The snapshot is stored in one (or more) snapshot volumes (SVOL) 107. A suitable amount of memory is allocated for fields 355-357. The information relating to the SVOLs for storing the snapshot is then stored into the fields 355-357. If additional volumes are required to store the snapshot, then additional memory is allocated for fields 355-357.

FIG. 5 illustrates the relationship between journal entries and snapshots. The snapshot 520 represents the first snapshot image of the data volumes 101 belonging to a journal group 102. Note that journal entries (510) having sequence numbers SEQ0 and SEQ1 have been made, and represent journal entries for two write operations. These entries show that journaling has been initiated at a time prior to the snapshot being taken (step 420). Thus, at a time corresponding to the sequence number SEQ2, the recovery manager 111 initiates the taking of a snapshot, and since journaling has been initiated, any write operations occurring during the taking of the snapshot are journaled. Thus, the write operations 500 associated with the sequence numbers SEQ3 and higher show that those operations are being journaled. As an observation, the journal entries identified by sequence numbers SEQ0 and SEQ1 can be discarded or otherwise ignored.

Recovering data typically requires recovering the data state of at least a portion of the data volumes 101 at a specific time. Generally, this is accomplished by applying one or more journal entries to a snapshot that was taken earlier in time relative to the journal entries. In the disclosed illustrative embodiment, the sequence number SEQ 313 is incremented each time it is assigned to a journal entry or to a snapshot. Therefore, it is a simple matter to identify which journal entries can be applied to a selected snapshot; i.e., those journal entries whose associated sequence numbers (JH_SEQ, 215) are greater than the sequence number (SS_SEQ, 351) associated with the selected snapshot.
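The selection rule just stated (JH_SEQ greater than SS_SEQ) can be shown with the sequence numbers of FIG. 5, where the journal entries SEQ0 and SEQ1 precede the snapshot at SEQ2. The function name and the use of plain integers are assumptions made for illustration, and sequence number wrap-around is not considered.

```python
def applicable_entries(journal_seqs, ss_seq):
    """Keep the journal sequence numbers (JH_SEQ) greater than the snapshot's SS_SEQ."""
    return [seq for seq in journal_seqs if seq > ss_seq]


# Journal entries SEQ0, SEQ1, SEQ3, SEQ4, SEQ5; the snapshot holds SEQ2.
journal_seqs = [0, 1, 3, 4, 5]
applicable = applicable_entries(journal_seqs, ss_seq=2)
```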

For example, the administrator may specify some point in time, presumably a time that is earlier than the time (the “target time”) at which the data in the data volume was lost or otherwise corrupted. The time field SS_TIME 352 for each snapshot is searched until a time earlier than the target time is found. Next, the Journal Headers 219 in the Journal Header Area 210 are searched, beginning from the “oldest” Journal Header. The oldest Journal Header can be identified by the “JO_” fields 333, 334, 337, and 338 in the management table. The Journal Headers are searched sequentially in the area 210 for the first header whose sequence number JH_SEQ 215 is greater than the sequence number SS_SEQ 351 associated with the selected snapshot. The selected snapshot is incrementally updated by applying each journal entry, one at a time, to the snapshot in sequential order, thus reproducing the sequence of write operations. This continues as long as the time field JH_TIME 214 of the journal entry is prior to the target time. The update ceases with the first journal entry whose time field 214 is past the target time.
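The replay procedure described above can be sketched as follows. The sketch models a snapshot as a Python dictionary from block address to data and each journal entry as a tuple (JH_SEQ, JH_TIME, address, data); these representations, the function name, and the treatment of an entry whose time equals the target time (it is not applied) are assumptions, not details given in the description.

```python
def restore(snapshot_blocks, ss_seq, journal, target_time):
    """Replay journal entries onto a copy of the snapshot up to the target time."""
    image = dict(snapshot_blocks)          # leave the snapshot itself untouched
    for jh_seq, jh_time, adr, data in journal:   # journal is in chronological order
        if jh_seq <= ss_seq:
            continue                       # entry predates the snapshot; skip it
        if jh_time >= target_time:
            break                          # first entry at or past the target time
        image[adr] = data                  # reproduce the write operation
    return image


# Hypothetical data: the snapshot carries SS_SEQ 2, the target time is 10.0.
snapshot = {10: "a0", 11: "b0"}
journal = [
    (0, 5.0, 10, "a0"),
    (1, 6.0, 11, "b0"),
    (3, 8.0, 10, "a1"),
    (4, 9.0, 12, "c1"),
    (5, 11.0, 10, "a2"),
]
recovered = restore(snapshot, ss_seq=2, journal=journal, target_time=10.0)
```

In this example the entries with sequence numbers 3 and 4 are applied, and the entry with sequence number 5 is not, since its write time is past the target time.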

In accordance with one aspect of the invention, a single snapshot is taken. All journal entries subsequent to that snapshot can then be applied to reconstruct the data state at a given time. In accordance with another aspect of the present invention, multiple snapshots can be taken. This is shown in FIG. 5A where multiple snapshots 520′ are taken. In accordance with the invention, each snapshot and journal entry is assigned a sequence number in the order in which the object (snapshot or journal entry) is recorded. It can be appreciated that there typically will be many journal entries 510 recorded between each snapshot 520′. Having multiple snapshots allows for quicker recovery time for restoring data. The snapshot closest in time to the target recovery time would be selected. The journal entries made subsequent to the snapshot could then be applied to restore the desired data state.
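With several snapshots available, the choice of snapshot might be sketched as below. The sketch assumes that "closest in time" means the latest snapshot taken before the target time, since journal entries are applied forward from a snapshot; the tuple layout (SS_SEQ, SS_TIME) and the function name are hypothetical.

```python
def select_snapshot(snapshots, target_time):
    """Return the latest (SS_SEQ, SS_TIME) pair taken before target_time, or None."""
    candidates = [s for s in snapshots if s[1] < target_time]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s[1])


# Three hypothetical snapshots as (SS_SEQ, SS_TIME) pairs.
snapshots = [(2, 7.0), (40, 20.0), (90, 35.0)]
chosen = select_snapshot(snapshots, target_time=30.0)
```

Choosing the latest earlier snapshot keeps the number of journal entries to replay small, which is the source of the quicker recovery time.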

FIG. 6 illustrates another aspect of the present invention. In accordance with the invention, a journal entry is made for every write operation issued from the host; this can result in a rather large number of journal entries. As time passes and journal entries accumulate, the one or more journal volumes 106 defined by the recovery manager 111 for a journal group 102 will eventually fill up. At that time no more journal entries can be made. As a consequence, subsequent write operations would not be journaled and recovery of the data state subsequent to the time the journal volumes become filled would not be possible.

FIG. 6 shows that the storage system 100 will apply journal entries to a suitable snapshot in response to detection of an “overflow” condition. An “overflow” is deemed to exist when the available space in the journal volume(s) falls below some predetermined threshold. It can be appreciated that many criteria can be used to determine if an overflow condition exists. A straightforward threshold is based on the total storage capacity of the journal volume(s) assigned for a journal group. When the free space falls to some percentage (say, 10%) of the total storage capacity, then an overflow condition exists. Another threshold might be used for each journal volume. In an aspect of the invention, the free space capacity in the journal volume(s) is periodically monitored. Alternatively, the free space can be monitored in an aperiodic manner. For example, the intervals between monitoring can be randomly spaced. As another example, the monitoring intervals can be spaced apart depending on the level of free space; i.e., the monitoring interval can vary as a function of the free space level.
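The capacity-based threshold can be expressed in a few lines. This is a sketch only: the function name, the block units, the default of 10%, and the choice to treat free space exactly at the threshold as not yet an overflow are assumptions.

```python
def is_overflow(free_blocks, total_blocks, threshold=0.10):
    """Report an overflow when free space drops below the given fraction of capacity."""
    return free_blocks / total_blocks < threshold


# Hypothetical journal volume capacity of 1000 blocks.
nearly_full = is_overflow(free_blocks=50, total_blocks=1000)
half_full = is_overflow(free_blocks=500, total_blocks=1000)
```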

FIG. 7 highlights the processing which takes place in the storage system 100 to detect an overflow condition. Thus, in a step 710, the storage system periodically checks the total free space of the journal volume(s) 106; e.g., every ten seconds. The free space can easily be calculated since the pointers (e.g., JI_CTL_VOL 331, JI_CTL_ADDR 332) in the management table 300 maintain the current state of the storage consumed by the journal volumes. If the free space is above the threshold, then the monitoring process simply waits for a period of time to pass and then repeats its check of the journal volume free space.

If the free space falls below a predetermined threshold, then in a step 720 some of the journal entries are applied to a snapshot to update the snapshot. In particular, the oldest journal entry(ies) are applied to the snapshot.

Referring to FIG. 3, the Journal Header 219 of the “oldest” journal entry is identified by the JO_HEAD_VOL field 333 and the JO_HEAD_ADR field 334. These fields identify the journal volume and the location in the journal volume of the Journal Header Area 210 of the oldest journal entry. Likewise, the Journal Data of the oldest journal entry is identified by the JO_DATA_VOL field 337 and the JO_DATA_ADR field 338. The journal entry identified by these fields is applied to a snapshot. The snapshot that is selected is the snapshot having an associated sequence number closest to the sequence number of the journal entry and earlier in time than the journal entry. Thus, in this particular implementation where the sequence number is incremented each time, the snapshot having the sequence number closest to but less than the sequence number of the journal entry is selected (i.e., “earlier in time”). When the snapshot is updated by applying the journal entry to it, the applied journal entry is freed. This can simply involve updating the JO_HEAD_VOL field 333, JO_HEAD_ADR field 334, JO_DATA_VOL field 337, and the JO_DATA_ADR field 338 to the next journal entry.

As an observation, it can be appreciated by those of ordinary skill that the sequence numbers will eventually wrap and start counting from zero again. It is well within the level of ordinary skill to provide a suitable mechanism for keeping track of this when comparing sequence numbers.
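One common way to compare counters that wrap is modular ("serial number") comparison. The sketch below is offered only as an example of such a mechanism; the text does not say which mechanism is used. The 32-bit counter width and the name `seq_earlier` are assumptions, and the comparison is only meaningful when the two numbers are less than half the counter range apart.

```python
SEQ_BITS = 32                    # assumed counter width; not stated in the text
SEQ_MOD = 1 << SEQ_BITS          # counter wraps to zero at this value
HALF = 1 << (SEQ_BITS - 1)       # half the range, used as the comparison window


def seq_earlier(a: int, b: int) -> bool:
    """Return True if sequence number a was assigned before b.

    The forward distance from a to b is taken modulo the counter range,
    so a number issued just before a wrap still compares as earlier than
    one issued just after it.
    """
    return a != b and ((b - a) % SEQ_MOD) < HALF
```

For example, `seq_earlier(SEQ_MOD - 1, 2)` is true even though the first value is numerically larger, because the counter wrapped between the two assignments.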

Continuing with FIG. 7, after applying the journal entry to the snapshot to update the snapshot, a check is made of the increase in the journal volume free space as a result of the applied journal entry being freed up (step 730). The free space can be compared against the threshold criterion used in step 710. Alternatively, a different threshold can be used. For example, here a higher amount of free space may be required to terminate this process than was used to initiate the process. This avoids invoking the process too frequently, but once invoked the second higher threshold encourages recovering as much free space as is reasonable. It can be appreciated that these thresholds can be determined empirically over time by an administrator.

Thus, in step 730, if the threshold for stopping the process is met (i.e., free space exceeds threshold), then the process stops. Otherwise, step 720 is repeated for the next oldest journal entry. Steps 730 and 720 are repeated until the free space level meets the threshold criterion used in step 730.
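The loop of steps 710, 720, and 730, with a higher threshold for stopping than for starting, can be sketched as follows. This is an illustration under simplifying assumptions: every journal entry is taken to occupy one unit of capacity, the journal is an in-memory queue with the oldest entry first, and the function name `reclaim` and the 10% and 25% figures are placeholders rather than values from the disclosure (only the 10% start figure appears in the text, as an example).

```python
from collections import deque


def reclaim(journal, snapshot, capacity, start_free=0.10, stop_free=0.25):
    """Apply the oldest entries to the snapshot until enough space is free.

    journal  -- deque of (addr, data) tuples, oldest first
    snapshot -- dict mapping addr -> data, updated in place
    capacity -- total number of entries the journal volume can hold
    Returns the number of journal entries that were applied and freed.
    """
    def free_fraction():
        return (capacity - len(journal)) / capacity

    applied = 0
    if free_fraction() >= start_free:      # step 710: no overflow detected
        return applied
    # Step 730 uses the higher stop threshold, so one invocation frees
    # more than the bare minimum and the process is not re-entered at once.
    while journal and free_fraction() < stop_free:
        addr, data = journal.popleft()     # step 720: take the oldest entry
        snapshot[addr] = data              # apply it to the snapshot
        applied += 1
    return applied
```

Using two thresholds gives the hysteresis described above: the process starts only when free space is scarce, and once started it continues until a more comfortable margin is restored.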

FIG. 7A highlights sub-steps for an alternative embodiment to step 720 shown in FIG. 7. Step 720 frees up a journal entry by applying it to the latest snapshot that is not later in time than the journal entry. However, where multiple snapshots are available, it may be possible to avoid the time consuming process of applying the journal entry to a snapshot in order to update the snapshot.

FIG. 7A shows details for a step 720′ that is an alternate to step 720 of FIG. 7. At a step 721, a determination is made whether a snapshot exists that is later in time than the oldest journal entry. This determination can be made by searching for the first snapshot whose associated sequence number is greater than that of the oldest journal entry. Alternatively, a snapshot that is a predetermined amount of time later than the oldest journal entry can be selected; for example, the criterion may be that the snapshot must be at least one hour later in time than the oldest journal entry. Still another alternate is to use the sequence numbers associated with the snapshots and the journal entries, rather than time. For example, the criterion might be to select a snapshot whose sequence number is N increments away from the sequence number of the oldest journal entry.

If such a snapshot can be found in step 721, then the earlier journal entries can be removed without having to apply them to a snapshot. Thus, in a step 722, the “JO_” fields (JO_HEAD_VOL 333, JO_HEAD_ADR 334, JO_DATA_VOL 337, and JO_DATA_ADR 338) are simply moved to a point in the list of journal entries that is later in time than the selected snapshot. If no such snapshot can be found, then in a step 723 the oldest journal entry is applied to a snapshot that is earlier in time than the oldest journal entry, as discussed for step 720.
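Steps 721 through 723 can be sketched with sequence numbers alone. The function below is hypothetical: it models the “JO_” fields as a single index into an ascending list of journal sequence numbers, ignores wraparound, and returns `None` to signal that the caller should fall back to step 723. The `most_recent` flag is an added convenience for choosing the newest later snapshot instead of the first one.

```python
from bisect import bisect_right


def advance_head(journal_seqs, snapshot_seqs, most_recent=False):
    """Return the index of the new oldest journal entry, or None.

    journal_seqs  -- ascending sequence numbers of the entries still held
    snapshot_seqs -- sequence numbers of the available snapshots
    """
    oldest = journal_seqs[0]
    # Step 721: look for snapshots taken after the oldest journal entry.
    later = [s for s in snapshot_seqs if s > oldest]
    if not later:
        return None                      # step 723: caller applies the entry
    chosen = max(later) if most_recent else min(later)
    # Step 722: move the head to the first entry recorded after the chosen
    # snapshot; everything before it is freed without being applied.
    return bisect_right(journal_seqs, chosen)
```

Choosing the first later snapshot frees the fewest entries and so preserves the widest recovery window; choosing the most recent snapshot frees the most space at the cost of that window.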

Still another alternative for step 721 is simply to select the most recent snapshot. All the journal entries whose sequence numbers are less than that of the most recent snapshot can be freed. Again, this simply involves updating the “JO_” fields so they point to the first journal entry whose sequence number is greater than that of the most recent snapshot. Recall that an aspect of the invention is being able to recover the data state for any desired point in time. This can be accomplished by storing as many journal entries as possible and then applying the journal entries to a snapshot to reproduce the write operations. This last embodiment has the potential effect of removing large numbers of journal entries, thus reducing the range of time within which the data state can be recovered. Nevertheless, for a particular configuration it may be desirable to remove large numbers of journal entries for a given operating environment.

Another aspect of the present invention is the ability to place a “marker” among the journal entries. In accordance with an illustrative embodiment of this aspect of the invention, an application programming interface (API) can be provided to manipulate these markers, referred to herein as marker journal entries, marker journals, etc. Marker journals can be created and inserted among the journal entries to note actions performed on the data volume (production volume) 101 or events in general (e.g., system boot up). Marker journals can be searched and used to identify previously marked actions and events. The API can be used by high-level (or user-level) applications. The API can include functions that are limited to system level processes.

FIG. 8 shows additional detail in the block diagram illustrated in FIG. 1. A Management Program (MP) 811 component comprises a Manager 814 and a Driver 813. The Driver component provides a set of API's to provide journaling functions implemented in storage system 100 in accordance with this aspect of the invention. The Manager component represents an example of an application program that uses the API's provided by the Driver component. As will be discussed below, user applications 112 can use parts of the API provided by the Driver. Following is a usage exemplar, illustrating the basic functionality provided by an API in accordance with the present invention.

The Manager component 814 can be configured to monitor operations on all or parts of a data volume (production data store) 101 such as a database, a directory, one or more files, or other objects of the file system. A user can be provided with access to the Manager via a suitable interface; e.g., command line interface, GUI, etc. The user can interact with the Manager to specify objects and operations on those objects to be monitored. When the Manager detects a specified operation on the object, it calls an appropriate marker journal function via the API to create a marker journal to mark the event or action. Among other things, the marker journal can include information such as a filename, the detected operation, the name of the host 110, and a timestamp.

The Driver component 813 can interact with the storage system 100 accordingly to create the marker. In response, the storage system 100 creates the marker journal in the same manner as discussed above for journal entries associated with write operations. Referring for a moment to FIG. 2, the journal type field (JH_TYPE) 218 can be set to MARKER to indicate that the journal entry is a marker journal. Journal entries associated with write operations would have a field value of INTERNAL. Any information that is associated with the marker journal entry can be stored in the journal data area of the journal entry.

FIG. 9 illustrates an example for implementing an API based on a storage system 100 that implements the SCSI (small computer system interface) standard. A special device, referred to herein as a command device (CMD) 902, can be defined in the storage system 100. When the Driver component 813 issues a read request or a write request to the CMD device, the storage system 100 can intercept the request and treat it as a special command. For example, a write request to the CMD device can contain data (write data) that indicates a function relating to a marker journal such as creating a marker journal. Other functions will be discussed below. The write data can include marker information such as time range, filename, operation, and so on.

With a write command, the Manager component 814 can also specify to read special information from the storage system 100. In this case, the write command indicates the information to be read, and a following read command to the CMD device 902 actually reads the information. Thus, for example, a pair of write and read requests to the CMD device can be used to retrieve a marker journal entry and the data associated with the marker journal.
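The write-then-read exchange with the command device can be illustrated with an in-memory stand-in. Everything in this sketch is an assumption made for illustration: the class `FakeCommandDevice`, the JSON payload layout, and the function names `GENERATE_MARKER` and `GET_MARKER` as payload strings are not defined by the text, and a real Driver would issue SCSI read and write requests to the device rather than call Python methods.

```python
import json


class FakeCommandDevice:
    """In-memory stand-in for the CMD device; not a real SCSI target."""

    def __init__(self):
        self.markers = []      # marker information stored so far
        self._pending = b""    # result staged by the last GET_MARKER write

    def write(self, payload: bytes) -> None:
        # The storage system intercepts the write and treats its data
        # as a special command rather than as ordinary block data.
        cmd = json.loads(payload)
        if cmd["function"] == "GENERATE_MARKER":
            self.markers.append(cmd["info"])
        elif cmd["function"] == "GET_MARKER":
            want = cmd["criteria"]
            hits = [m for m in self.markers
                    if all(m.get(k) == v for k, v in want.items())]
            self._pending = json.dumps(hits).encode()

    def read(self) -> bytes:
        # The following read returns whatever the previous write selected.
        data, self._pending = self._pending, b""
        return data
```

A caller would write a `GET_MARKER` payload naming the retrieval criteria and then read from the same device to collect the matching marker information.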

An alternative implementation is to extend the SCSI command set. For example, the SCSI standard allows developers to extend the SCSI common command set (CCS) which describes the core set of commands supported by SCSI. Thus, special commands can be defined to provide the API functionality. From these implementation examples, one of ordinary skill in the relevant arts can readily appreciate that other implementations are possible.

FIG. 10 illustrates the interaction among the components shown in FIG. 1. A user 1002 on the host 110 can interact via a suitable API with the Manager component 814 or directly with the Driver component 813 to manipulate marker journals. The user can be an application level user or a system administrator. The “user” can be a machine that is suitably interfaced to the Manager component and/or the Driver component.

The Manager component 814 can provide its own API 814a to the user 1002. The functions provided by this API can be similar to the marker journal functions provided by the API 813a of the Driver component 813. However, since the Manager component provides a higher level of functionality, its API is likely to include functions not needed for managing marker journals. It can be appreciated that in other embodiments of the invention, a single API can be defined which includes the functionality of API's 813a and 814a.

The Driver component 813 communicates with the storage system 100 to initiate the desired action. As illustrated in FIG. 10, typical actions include, among others, generating marker journals, periodically retrieving journal entries, and recovery using marker journals.

Following is a list of functions provided by the API's according to an embodiment of the present invention:

GENERATE MARKER

-   -   This function will generate a marker journal entry. This function can be invoked by the user or by the Manager component 814 to generate a marker journal. The following information can be provided:
        -   1. operation: this specifies a data operation that is being performed on the object; e.g., deletion, re-format, closing a file, renaming, etc. It is possible that no data operation is specified. The user may simply wish to create a marker journal to identify the data state of the data volume 101 at some point in time.
        -   2. timestamp
        -   3. object name, e.g., filename, volume name, a database identifier, etc.
        -   4. hostname
        -   5. host IP Address
        -   6. comments
    -   The GENERATE MARKER request is sent through the Driver component 813 to the storage system 100. The storage system performs the following:
        -   1. Assign the next number in the sequence number SEQ 313 to the marker. In addition, a time value can be placed in the JH_TIME 214 field, thus associating a time of creation with the marker journal.
        -   2. Store the marker on the journal volume JVOL 106. The accompanying information is stored in the journal data area 225.
    -   The created marker journal entry is now inserted, in timewise sequence, into the list of journal entries.

GET MARKER

-   -   Retrieve one or more marker journal entries by specifying one or more of the following retrieval criteria:
        -   1. time: This can be a range of times, or a single time value. If a single time value is provided, the marker journals prior to the time value or subsequent to the time value can be retrieved. Some convention would be required to specify whether prior-in-time marker journals are obtained, or subsequent-in-time marker journals are obtained; e.g., a “+” sign and a “−” sign can be used.
        -   2. object name, e.g., filename, volume name, a database identifier, etc.
        -   3. operation: A specific operation can be used to specify which marker journal(s) to obtain.
    -   Generally, any of the data in the marker journal entry can be used as a retrieval criterion. For example, it may be desirable to allow a user to search the “comment” that is stored with the marker journal.
    -   The following information from the retrieved marker journals can be obtained, although it is understood that any information associated with the marker journal can be obtained.
        -   sequence number
        -   timestamp
        -   other information in journal data area 225

READ HEADER

-   -   The next two functions allow a user to see markers stored to the journal volume JVOL 106 at any time. The Driver 813 searches for the markers that a user wants to see. In order to speed up the search, the Driver 813 periodically reads journal headers 219, finds markers, reads journal data 225, and stores them to a file. This stores all the markers to a file in advance.
    -   This function obtains the header portion of a marker journal entry.
    -   A sequence number is provided to identify which journal header to read next. This is used to calculate the location of the first header.
    -   The number of journal headers is provided to indicate how many journal headers are to be communicated to the Driver 813.

READ JOURNAL

-   -   This function reads the journal header.
    -   A sequence number is provided to identify which journal header to read next. This is used to calculate the location of the first header.
    -   The location and length of the journal data are obtained from the JH_JNL 216, JH_JADR 217 and JH_LEN 213 fields. This information determines how much data is in a given marker journal.

INVOKE RECOVERY

-   -   This invokes a recovery action. A user can invoke recovery using the following parameters:
        -   timestamp as the recovery target time, or
        -   sequence number as the recovery target time.
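The behavior of GENERATE MARKER and GET MARKER listed above can be modeled with a small in-memory journal. This is a sketch under stated assumptions, not the storage system's implementation: the `Entry` and `Journal` classes, the method names, and the keyword arguments are hypothetical, and only a subset of the listed marker information is carried. The `direction` argument stands in for the “+”/“−” convention mentioned for single time values.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    seq: int      # next number in the shared sequence (SEQ)
    time: float   # creation time (JH_TIME)
    jtype: str    # JH_TYPE: "INTERNAL" for writes, "MARKER" for markers
    data: dict = field(default_factory=dict)  # journal data area


class Journal:
    """Toy journal; the real entries live on the journal volume JVOL."""

    def __init__(self):
        self.entries = []
        self._next_seq = 0

    def _append(self, time, jtype, data):
        entry = Entry(self._next_seq, time, jtype, data)
        self._next_seq += 1
        self.entries.append(entry)  # writes and markers share one ordering
        return entry

    def record_write(self, time, addr, payload):
        return self._append(time, "INTERNAL", {"addr": addr, "payload": payload})

    def generate_marker(self, time, operation=None, object_name=None,
                        hostname=None, comments=""):
        info = {"operation": operation, "object": object_name,
                "hostname": hostname, "comments": comments}
        return self._append(time, "MARKER", info)

    def get_marker(self, time=None, direction="-", operation=None,
                   object_name=None):
        """Return markers matching every criterion that is not None.

        direction "-" keeps markers at or before `time`; "+" keeps
        markers at or after it.
        """
        hits = []
        for e in self.entries:
            if e.jtype != "MARKER":
                continue
            if time is not None:
                if direction == "-" and e.time > time:
                    continue
                if direction == "+" and e.time < time:
                    continue
            if operation is not None and e.data["operation"] != operation:
                continue
            if object_name is not None and e.data["object"] != object_name:
                continue
            hits.append(e)
        return hits
```

Because markers draw from the same sequence as write entries, the sequence number of a retrieved marker can be passed directly to a recovery request as the target.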

Objects can be monitored for certain actions. For example, the Manager component 814 can be configured to monitor the data volume 101 for user-specified activity (data operations) to be performed on objects contained in the volume. The object can be the entire volume, a file system or portions of a file system. The object can include application objects such as files, database components, and so on. Activities include, among others, closing a file, removing an object, manipulation (creation, deletion, etc.) of symbolic links to files and/or directories, formatting all or a portion of a volume, and so on.

A user can specify which actions to detect. When the Manager 814 detects a specified operation, the Manager can issue a GENERATE MARKER request to mark the event. Similarly, the user can specify an action or actions to be performed on an object or objects. When the Manager detects a specified action on a specified object, a GENERATE MARKER request can be issued to mark the occurrence of that event.

The user can also mark events that take place within the data volume 101. For example, when the user shuts down the system, she might issue a SYNC command (in the case of a UNIX OS) to sync the file system and also invoke the GENERATE MARKER command to mark the event of syncing the file system. She might mark the event of booting up the system. It can be appreciated that the Manager component 814 can be configured to detect and automatically act on these events as well. It is observed that an event can be marked before or after the occurrence of the event. For example, the actions of deleting a file or SYNC'ing a file system are probably best performed prior to marking the action. If a major update of a data file or a database is about to be performed, it might be prudent to create a marker journal before proceeding; this can be referred to as “pre-marking” the event.

The foregoing mechanisms for manipulating marker journals can be used to facilitate recovery. For example, suppose a system administrator configures the Manager component 814 to mark every “delete” operation that is performed on “file” objects. Each time a user in the host 110 performs a file delete, a marker journal entry can be created (using the GENERATE MARKER command) and stored in the journal volume 106. This operation is a type where it might be desirable to “pre-mark” each such event; that is, a marker journal entry is created prior to carrying out the delete operation to mark a point in time just prior to the operation. Thus, over time, the journal entries contained in the journal volumes will be sprinkled with marker journal entries identifying points in time prior to each file deletion operation.

If a user later wishes to recover an inadvertently deleted file, the marker journals can be used to find a suitable recovery point. For example, the user is likely to know roughly when he deleted a file. A GET MARKER command that specifies a time prior to the estimated time of deletion and further specifies an operation of “delete” on objects of “file” with the name of the deleted file as an object can be issued to the storage system 100. The matching marker journal entry is then retrieved. This journal entry identifies a point in time prior to the delete operation, and can then serve as the recovery point for a subsequent recovery operation. As can be seen in FIG. 2, all journal entries, including marker journals, have a sequence number. Thus, the sequence number of the retrieved marker journal entry can be used to determine the latest journal entry just prior to the deletion action. A suitable snapshot is obtained and updated with journal entries of type INTERNAL, up to the latest journal entry. At that point, the data state of the volume reflects the time just before the file was deleted, thus allowing for the deleted file to be restored.
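Using the sequence number of a pre-marked delete as the cutoff for replay can be sketched as below. The tuple layout, the function name `recover_before_marker`, and the in-memory block dictionary are illustrative assumptions; the sketch also assumes the journal is supplied in ascending sequence order and that sequence numbers have not wrapped.

```python
def recover_before_marker(snapshot_seq, snapshot_blocks, journal, marker_seq):
    """Return the data state just before the marked operation.

    journal -- iterable of (seq, jtype, addr, data) in ascending seq order,
               where jtype is "INTERNAL" for writes and "MARKER" for markers
    """
    volume = dict(snapshot_blocks)   # work on a copy of the snapshot image
    for seq, jtype, addr, data in journal:
        if seq <= snapshot_seq:
            continue                 # already reflected in the snapshot
        if seq >= marker_seq:
            break                    # stop at the marker: the recovery point
        if jtype == "INTERNAL":      # markers carry no write data to apply
            volume[addr] = data
    return volume
```

Because the marker was written before the delete was carried out, stopping at its sequence number leaves the deleted file intact in the recovered state.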

FIG. 11 illustrates recovery processing according to an illustrative embodiment of the present invention. The storage system 100 determines in a step 1110 whether recovery is possible. A snapshot must have been taken between the oldest journal entry and latest journal entry. As discussed above, every snapshot has a sequence number taken from the same sequence of numbers used for the journal entries. The sequence number can be used to identify a suitable snapshot. If the sequence number of a candidate snapshot is greater than that of the oldest journal and smaller than that of the latest journal, then the snapshot is suitable.
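The suitability test of step 1110 reduces to a range check on sequence numbers. The two function names below are hypothetical, and the sketch assumes the sequence numbers have not wrapped.

```python
def suitable_snapshots(snapshot_seqs, oldest_journal_seq, latest_journal_seq):
    """Return the snapshots whose sequence numbers lie strictly between
    those of the oldest and the latest journal entries."""
    return [s for s in snapshot_seqs
            if oldest_journal_seq < s < latest_journal_seq]


def recovery_possible(snapshot_seqs, oldest_journal_seq, latest_journal_seq):
    """Step 1110: recovery is possible if at least one snapshot is suitable."""
    return bool(suitable_snapshots(snapshot_seqs, oldest_journal_seq,
                                   latest_journal_seq))
```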

Then in a step 1120, the recovery volume is set to an offline state. The term “recovery volume” is used in a generic sense to refer to one or more volumes on which the data recovery process is being performed. In the context of the present invention, “offline” is taken to mean that the user, and more generally the host device 110, cannot access the recovery volume. For example, in the case that the production volume is being used as the recovery volume, it is likely to be desirable that the host 110 be prevented at least from issuing write operations to the volume. Also, the host typically will not be permitted to perform read operations. Of course, the storage system itself has full access to the recovery volume in order to perform the recovery task.

In a step 1130, the snapshot is copied to the recovery volume in preparation for the recovery operation. The production volume itself can be the recovery volume. However, it can be appreciated that the recovery manager 111 can allow the user to specify a volume other than the production volume to serve as the target of the data recovery operation. For example, the recovery volume can be the volume on which the snapshot is stored. Using a volume other than the production volume to perform the recovery operation may be preferred where it is desirable to provide continued use of the production volume.

In a step 1140, one or more journal entries are applied to update the snapshot volume in the manner as discussed previously. Enough journal entries are applied to update the snapshot to a point in time just prior to the occurrence of the file deletion. At that point the recovery volume can be brought “online.” In the context of the present invention, the “online” state is taken to mean that the host device 110 is given access to the recovery volume.

Referring again to FIG. 10, according to another aspect of the invention, periodic retrievals of marker journal entries can be made and stored locally in the host 110 using the GET MARKER command and specifying suitable criteria. For example, the Driver component 813 might periodically issue a GET MARKER for “delete” operations performed on “file” objects. Other retrieval criteria can be specified. Having a locally accessible copy of certain marker journals avoids the delay of retrieving one marker journal at a time from the storage system 100. This can greatly speed up a search for a recovery point.

From the foregoing, it can be appreciated that the API definition can be readily extended to provide additional functionality. The disclosed embodiments typically can be provided using a combination of hardware and software implementations; e.g., combinations of software, firmware, and/or custom logic such as ASICs (application specific ICs) are possible. One of ordinary skill can readily appreciate that the underlying technical implementation will be determined based on factors including but not limited to system cost, system performance, the existence of legacy software and legacy hardware, operating environment, and so on. The disclosed embodiments can be readily reduced to specific implementations without undue experimentation by those of ordinary skill in the relevant art.

1. A method for accessing data contained in a data store comprising: detecting a user-request to perform an operation on an object stored in a data store and in response thereto communicating a request to the data store to perform the operation and communicating a marker request to the data store, the marker request including information indicative of the operation and the object, wherein the marker request produces a marker journal entry; detecting a user-request to retrieve a specified marker journal entry and in response thereto communicating a request to the data store to retrieve the specified marker journal entry; and detecting a user-request to perform a recovery operation and in response thereto communicating a recovery request to the data store to restore a data state of the data store, the user-request including information including a target time of the data state, the target time being based on a time associated with a previously retrieved marker journal entry.
2. The method of claim 1 wherein the user-request to retrieve a specified marker journal entry includes information indicating at least one of a target time, an operation, and an object name.
3. The method of claim 1 further comprising obtaining the previously retrieved marker journal entry based on one of an operation on an object and an object name.
4. The method of claim 1 further comprising retrieving a plurality of marker journal entries and presenting one or more of the marker journal entries to a user, wherein the previously retrieved marker journal entry is a user selected one of the marker journal entries.
5. The method of claim 1 wherein the marker journal entries are retrieved periodically over a span of time.
6. A method for processing data on a data store comprising: receiving user-requests for operations to be performed on a data store; for each user-request, communicating one or more requests to the data store to perform the user-request; monitoring the user-requests; and if a user-request is a predetermined operation, then communicating a marker journal request to the data store in addition to communicating the one or more requests, thereby creating a marker journal entry to mark a time of occurrence of the predetermined operation, wherein the marker journal request includes information representative of the predetermined operation, wherein communicating a marker journal request includes invoking first application program interface (API) program code to transmit the marker journal request to the data store.
7. The method of claim 6 further comprising receiving a user-request to retrieve a marker journal entry and in response thereto communicating a marker retrieval request to the data store, wherein the marker retrieval request includes one or more retrieval criteria, wherein the communicating includes invoking second API program code to transmit the marker retrieval request to the data store.
8. The method of claim 7 further comprising receiving a retrieved marker journal entry from the data store and storing the retrieved marker journal entry, wherein the retrieved marker journal entry satisfies the one or more retrieval criteria.
9. The method of claim 8 further comprising communicating additional marker retrieval requests to the data store and storing additional retrieved marker journal entries.
10. The method of claim 6 further comprising receiving user-information indicative of one or more predetermined operations to be monitored.
11. A method for processing data contained in a data store comprising: receiving user-requests for operations to be performed on a data store; for each user-request, communicating one or more associated requests to the data store to perform the user-request; for at least some of the user-requests, communicating a marker journal request to the data store in addition to communicating the one or more associated requests, thereby creating one or more marker journal entries to mark a time of occurrence of some of the user-requests; retrieving one or more first marker journal entries from the data store, based on one or more retrieval criteria; displaying the first marker journal entries; receiving a user-selected one of the first marker journal entries; and performing a recovery operation based on a target time associated with the user-selected one of the first marker journal entries.
12. The method of claim 11 wherein communicating a marker journal request includes invoking first API program code to communicate with the data store.
13. The method of claim 12 wherein retrieving one or more first marker journal entries includes performing one or more invocations of second API program code to communicate with the data store.
14. The method of claim 13 wherein performing a recovery operation includes performing one or more invocations of third API program code to communicate with the data store.
15. The method of claim 11 further comprising receiving user-information representative of the at least some of the user-requests.
16. The method of claim 15 wherein the user-information includes one or more of an operation to be performed in the data store and an object contained in the data store.
17. A method for processing data in a data store comprising: producing one or more snapshots of a data store; detecting write requests directed to the data store and in response thereto producing journal entries corresponding to the write requests, wherein the journal entries can be applied to one of the snapshots to recreate one or more data states of the data store; detecting a marker request and in response thereto producing a marker journal entry, wherein the journal entries and the marker journal entries are ordered according to the time of their respective write requests and marker requests; detecting a request to retrieve a specified marker journal entry and in response thereto accessing the specified marker journal entry; and detecting a request to perform a recovery operation, the request including a target time based on a time associated with a previously retrieved marker journal entry.
18. The method of claim 17 further comprising assigning a sequence number to each journal entry and to the marker journal entry in the order in which the entries are produced.
19. The method of claim 17 wherein the marker request is detected as part of performing a predetermined operation on an object stored on the data store.
20. Computer apparatus for processing data contained in a data store comprising: a data processing component; a communication component configured to communicate between a host device and a data store; and computer program code configured to operate one or more of the data processing component and the communication component to perform steps of: communicating marker journal requests to the data store, to create a plurality of marker journals; communicating marker retrieval requests to the data store, to retrieve one or more of the marker journal entries; and communicating a data recovery request to the data store, to perform a recovery operation to recover a data state in the data store; wherein the computer program code is configured as an application programming interface (API) to allow an application program to perform one or more of the steps of communicating.
21. The computer apparatus of claim 20 wherein each marker journal request includes information indicative of one of an object contained in the data store and an operation to be performed on an object contained in the data store.
22. The computer apparatus of claim 20 wherein the marker retrieval requests are based on information associated with the marker journal entries.
23. The computer apparatus of claim 20 wherein the data recovery request includes a target time indicative of the data state to be recovered.
24. The computer apparatus of claim 23 wherein the target time is based on a time associated with a previously retrieved marker journal entry.
25. A computer program product for processing data on a data store comprising: a storage component having stored therein computer program code, the computer program code comprising an application program interface (API), the API comprising: a first API component configured to allow execution of first program code, the first program code configured to communicate a marker journal request to a data store to create a marker journal entry, the marker journal request including marker information indicative of one or more of an object contained in the data store and an operation on an object contained in the data store, the marker information being associated with the marker journal entry; a second API component configured to allow execution of second program code, the second program code configured to communicate a marker retrieval request to the data store to retrieve at least one marker journal entry, the marker retrieval request including retrieval criteria based on the marker information; and a third API component configured to allow execution of third program code, the third program code configured to communicate a recovery request to the data store to recover a data state of the data store.
26. The computer program product of claim 25 wherein the recovery request includes a target time that is based on a time associated with a previously retrieved marker journal entry.
27. The computer program product of claim 25 wherein the API further comprises a fourth API component configured to allow execution of fourth program code, the fourth program code configured to monitor one or more operations on one or more objects contained in the data store.
28. The computer program product of claim 27 wherein the API further comprises a fifth API component configured to allow execution of fifth program code, the fifth program code configured to communicate a marker retrieval request to the data store to retrieve a marker journal entry.
29. The computer program product of claim 28 wherein the fifth program code is further configured to communicate a plurality of marker retrieval requests to retrieve a plurality of retrieved marker journal entries, wherein the recovery request includes a target time that is based on a time associated with one of the retrieved marker journal entries.
30. The computer program product of claim 27 wherein the API further comprises: a fifth API component configured to allow execution of fifth program code, the fifth program code configured to communicate a plurality of marker retrieval requests to the data store to retrieve a plurality of marker journal entries; and a sixth API component configured to allow execution of sixth program code, the sixth program code configured to display the plurality of marker journal entries, wherein the recovery request includes a target time that is based on a time associated with one of the retrieved marker journal entries.